
Collaborative Deep Learning in Fixed Topology Networks

Neural Information Processing Systems

There is significant recent interest in parallelizing deep learning algorithms to handle the enormous growth in data and model sizes. While most advances focus on model parallelization and on engaging multiple computing agents through a central parameter server, data parallelization combined with decentralized computation has not been explored sufficiently. In this context, this paper presents a new consensus-based distributed SGD (CDSGD) algorithm, and its momentum variant CDMSGD, for collaborative deep learning over fixed topology networks that enables data parallelization as well as decentralized computation. Such a framework can be extremely useful for learning agents with access to only local/private data in a communication-constrained environment. We analyze the convergence properties of the proposed algorithm with strongly convex and nonconvex objective functions with fixed and diminishing step sizes using concepts of Lyapunov function construction. We demonstrate the efficacy of our algorithms in comparison with the baseline centralized SGD and the recently proposed federated averaging algorithm (that also enables data parallelism) based on benchmark datasets such as MNIST, CIFAR-10 and CIFAR-100.
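To make the idea of consensus-based distributed SGD concrete, here is a minimal NumPy sketch on a toy least-squares problem. The abstract does not spell out the update rule, so the form used here (each agent averages its neighbors' parameters through a mixing matrix, then subtracts its own local stochastic gradient) is an assumption based on the usual consensus-plus-gradient scheme. The ring topology, the weights, and the names `ring_mixing_matrix`, `cdsgd`, and `demo` are all hypothetical choices for illustration, not taken from the paper.

```python
import numpy as np


def ring_mixing_matrix(n_agents, self_weight=0.5):
    """Doubly stochastic mixing matrix for a ring (assumed topology).

    Each agent keeps `self_weight` of its own parameters and splits the
    remainder equally between its two ring neighbors.
    """
    pi = np.zeros((n_agents, n_agents))
    side = (1.0 - self_weight) / 2.0
    for j in range(n_agents):
        pi[j, j] += self_weight
        pi[j, (j - 1) % n_agents] += side
        pi[j, (j + 1) % n_agents] += side
    return pi


def cdsgd(features, targets, pi, step_size=0.05, n_steps=2000,
          batch_size=8, seed=0):
    """Consensus-style distributed SGD on local least-squares losses.

    features: (n_agents, n_local, dim) private data of each agent
    targets:  (n_agents, n_local)
    pi:       (n_agents, n_agents) fixed mixing matrix
    Returns the (n_agents, dim) array of per-agent parameters.
    """
    rng = np.random.default_rng(seed)
    n_agents, n_local, dim = features.shape
    x = np.zeros((n_agents, dim))
    for _ in range(n_steps):
        grads = np.empty_like(x)
        for j in range(n_agents):
            # Each agent samples a minibatch from its own data only.
            idx = rng.choice(n_local, size=batch_size, replace=False)
            a, b = features[j, idx], targets[j, idx]
            grads[j] = a.T @ (a @ x[j] - b) / batch_size
        # Assumed update: neighbor averaging, then a local gradient step.
        x = pi @ x - step_size * grads
    return x


def demo():
    """Five agents, each holding 40 private samples of one linear model."""
    rng = np.random.default_rng(1)
    w_true = np.array([1.0, -2.0, 0.5])
    features = rng.normal(size=(5, 40, 3))
    targets = features @ w_true + 0.01 * rng.normal(size=(5, 40))
    x = cdsgd(features, targets, ring_mixing_matrix(5))
    return x, w_true


if __name__ == "__main__":
    x, w_true = demo()
    print("max deviation from w_true:", np.abs(x - w_true).max())
```

No parameter server appears anywhere in this sketch: the only communication is the product `pi @ x`, which touches only entries where the fixed topology has an edge.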


Reviews: Collaborative Deep Learning in Fixed Topology Networks

Neural Information Processing Systems

This paper explores a fixed peer-to-peer communication topology without a parameter server. To demonstrate convergence, it shows that the Lyapunov function being minimized includes a regularizer term that incorporates the topology of the network. This leads to convergence rate bounds in the convex setting and convergence guarantees in the non-convex setting. This is original work of high technical quality, well positioned with a clear introduction. It is very rare to see proper convergence bounds in such a complex parallelization setting, and the key to the proof is really neat (I did not check all the details).
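The role of the topology-dependent regularizer can be illustrated with a short derivation. This is a sketch under assumptions, not the paper's exact statement: the symbols below (stacked parameters $x$, local losses $f_j$, mixing matrix $\Pi$, step size $\alpha$) and the scaling of each term are chosen for illustration, and the paper's own normalization may differ. It assumes $\Pi$ is symmetric and that each agent applies neighbor averaging followed by a local gradient step.

```latex
% Stack the N agents' parameters as x = (x^1, ..., x^N) and let
% F(x) = \sum_{j=1}^{N} f_j(x^j) be the sum of local losses.
% Candidate Lyapunov function with a topology-dependent regularizer:
V(x) = F(x) + \frac{1}{2\alpha}\, x^{\top} (I - \Pi)\, x .
% Its gradient, using the symmetry of \Pi:
\nabla V(x) = \nabla F(x) + \frac{1}{\alpha}\, (I - \Pi)\, x .
% One gradient-descent step on V with step size \alpha:
x_{k+1} = x_k - \alpha \nabla V(x_k)
        = x_k - \alpha \nabla F(x_k) - (I - \Pi)\, x_k
        = \Pi x_k - \alpha \nabla F(x_k) .
```

Under these assumptions the decentralized update is ordinary gradient descent on $V$, and the quadratic term $x^{\top}(I - \Pi)x$ penalizes disagreement between neighboring agents, which is how the network topology enters the function being minimized.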

